What Is The Google Dance?
by Richard Zwicky


Whenever I attend trade shows, run seminars, or speak at symposiums, 
I am asked the question, "What is the Google Dance?" I've heard a 
few different theories about the Google Dance, but only one 
is really correct: it is the period when Google rebuilds its 
rankings, and results fluctuate widely for 3 to 5 days.

How Often Does The Google Dance Happen?
The name "Google Dance" is often used to describe the period 
when a major index update of the Google search engine is being 
implemented. These major Google index updates occur on average 
every 36 days, or about 10 times per year. An update is easy to 
spot: search results change significantly, and Google's cache of 
all indexed pages is refreshed. These changes can be evident
from one minute to the next. But the update does not proceed as 
a change from one index to another like the flip of a switch. In 
fact, it takes several days to finish the complete update of the 
index. 
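
As a quick check on the arithmetic above, the short Python sketch 
below divides a year by the 36-day average interval. The 365-day 
year is my assumption; the 36-day figure is the one quoted above.

```python
# Rough check of the update frequency quoted in the article.
DAYS_PER_YEAR = 365          # assumption: a non-leap year
DAYS_BETWEEN_UPDATES = 36    # average interval quoted above

updates_per_year = DAYS_PER_YEAR / DAYS_BETWEEN_UPDATES
print(f"About {updates_per_year:.1f} updates per year")
```

The result is slightly more than 10, which matches the "10 times 
per year" figure once rounded.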

Because Google, like every other search engine, depends on its 
customers knowing that it delivers authoritative, reliable 
results 24 hours a day, seven days a week, updates pose a 
serious problem. Google can't 'shut down for maintenance' and 
cannot afford to go offline for even one minute. Hence, we have 
the Dance. Every search engine goes through it, some more often 
than Google and some less. However, it is only because of Google's 
reach that we pay more attention to its rebuild than to any other 
engine's.

During this period, the index is constantly in flux, and search 
results can vary wildly, because it is also during the Dance that 
Google makes any algorithm adjustments live, and updates the 
PageRank and Back Links for each web site it has indexed. 

Do Search Results Only Change During The Google Dance?
No. In fact, rankings change slightly throughout any given 
month. This is because Google's bot, or spider, is always 
running and finding new material. It also happens because the 
bot may have detected that a web site no longer exists and 
needs to be deleted from the index. During the Dance, the 
Googlebot will revisit every site, figure out how many sites 
link to it, and how many it links out to, and how valuable 
these links are.
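
To make the link counting concrete, here is a minimal Python 
sketch. It counts inbound and outbound links on a toy link graph 
and then scores the sites with a simplified version of the 
published PageRank idea. The site names, the damping value of 0.85 
and the function names are all assumptions for illustration; this 
is not Google's production method.

```python
# Toy link graph: each site maps to the sites it links out to.
# The site names are made up for illustration.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
}

def link_counts(links):
    """Return {site: (inbound, outbound)} for every site in the graph."""
    inbound = {site: 0 for site in links}
    for targets in links.values():
        for target in targets:
            inbound[target] = inbound.get(target, 0) + 1
    return {site: (inbound[site], len(links.get(site, []))) for site in inbound}

def simple_pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    Assumes every site is a key in `links` and has at least one outlink.
    """
    sites = list(links)
    rank = {site: 1.0 / len(sites) for site in sites}
    for _ in range(iterations):
        # Every site gets a small base share, plus shares from its inbound links.
        new_rank = {site: (1.0 - damping) / len(sites) for site in sites}
        for site, targets in links.items():
            share = rank[site] / len(targets)
            for target in targets:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

counts = link_counts(links)
ranks = simple_pagerank(links)
for site in links:
    print(site, counts[site], round(ranks[site], 3))
```

In this toy graph, c.example has the most inbound links and ends 
up with the highest score, which is the intuition behind valuing 
a link by the standing of the site it comes from.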

Because Google is constantly crawling and updating selected 
pages, its search results will vary slightly over the course 
of the month. However, it is only during the Google Dance that 
these results can swing wildly. You also need to consider that 
Google has 8 data centers, sharing more than 10,000 servers. 
Somehow, the updates to the index that occur during the month, 
outside of the Google Dance, have to be propagated to all of 
them. It's a constant process for Google and every other 
search engine. These ongoing, incremental updates affect only 
parts of the index at any one time.

Checking the Google Dance
You may know that Google has 8 main www servers online, which are 
as follows:

· www-ex.google.com - (where you land when you type www.google.com) 
· www-sj.google.com - (can also be accessed at www2.google.com) 
· www-va.google.com - (can also be accessed at www3.google.com) 
· www-dc.google.com 
· www-ab.google.com 
· www-in.google.com 
· www-zu.google.com 
· www-cw.google.com 

During the Google Dance, you can check the 8 Google servers, and
they will sometimes display wildly differing results. They 
are said to be "dancing", hence the name "Google Dance". 

The easiest way to check whether the Google Dance is happening is 
to go to www.google.com and do a search. Look at the blue bar at 
the top of the page. It will read something like "Results 1 - 10 
of about 626,000. Search took 0.48 seconds". Now run the same 
search on www2.google.com and www3.google.com. If you see a 
different number of total pages for the same search, then the 
Google Dance is on. You can also check all the variations above. 
www2 is really www-sj, and www3 is www-va. We have found that 
all the others need their full www-extension.google.com address 
in the URL if you want to test them properly. Once the numbers and 
the order of results on all 8 www's are the same, you know the 
dance is over.
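
The comparison described above can be sketched in a few lines of 
Python. This is only an illustration: it does not contact any 
server, and it assumes you have noted down by hand the total count 
and the top results shown by each host. The function name 
dance_is_on, the counts (other than the 626,000 example above) and 
the result URLs are all invented.

```python
def dance_is_on(snapshots):
    """Return True if the servers disagree on total count or result order.

    snapshots maps a hostname to (total_results, ordered list of top URLs).
    """
    distinct = {(total, tuple(urls)) for total, urls in snapshots.values()}
    return len(distinct) > 1

# Hand-entered sample data; the counts and URLs are invented.
during_dance = {
    "www.google.com":  (626000, ["site-a.example", "site-b.example"]),
    "www2.google.com": (631000, ["site-b.example", "site-a.example"]),
    "www3.google.com": (626000, ["site-a.example", "site-b.example"]),
}

print(dance_is_on(during_dance))
```

Here www2 reports a different total and a different order, so the 
check reports that the servers disagree. Once every host returns 
the same count and the same order, the function returns False.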

Importance of The Google Dance
For most people, this event in and of itself is not important. 
However, for anyone in the search engine optimization industry, it 
is a period of note. First off, we always get lots of calls from 
clients during the Dance. Pages get temporarily dropped. 
Sometimes it lasts a day. People panic. Then when the pages are 
re-added, they are better placed than before, and things calm 
down. It's interesting to see how overpoweringly important this 
one engine is. 

The Technical Background of the Google Dance
The Google search engine pulls its results from more than 10,000 
servers. This means that when you type a question or query into 
Google, that request is handled by one of 10,000 computers. 
Whichever server gets the query has to have an answer for you 
within a fraction of a second. Imagine putting all the books in 
the Library of Congress on the floor of an airplane hangar and 
then asking for 'sun tzu art of war', and expecting to find the 
correct result in the blink of an eye. Impossible to imagine, 
isn't it? Yet we ask the search engines to do this for us every 
day.

Google uses Linux servers. When the rebuild happens, all 10,000 
of these servers are updated. Naturally, there will always be 
some variation from one index to the next, simply because new 
sites are always being added and content changes are always being 
made that affect the placement of some websites. But during the Google 
Dance, these variations are dramatic. One server after the other 
is updated with portions of the new index, until eventually, 
they are all updated with a completely new index database. 

Google Dance and DNS 
Not only is Google's index spread over more than 10,000 servers, 
but these servers are also housed in eight different data centers. 
These data centers are mainly located in the U.S.

Google uses multiple data centers to get results to the end user 
faster. If you access a data center that is physically close to 
you, then in theory, your connection needs to make fewer hops (or 
navigate fewer intersections) to get to the data center and back. 
Each data center has its own IP address (numerical address on the 
internet), and the Domain Name System (DNS) manages the way 
that these IP addresses are accessed. The system instantly routes 
your request to the nearest or least congested data center. It's 
then routed within that data center facility to an idle server. 
In this way, Google uses a two-step form of load balancing: 
DNS tables first, and then internal traffic management. 
Therefore, the distance for data transmissions can be reduced and 
the speed of response improved. 
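
The two-step routing described above can be sketched roughly as 
follows. This is a simplified model, not how DNS or Google's 
internal traffic management actually works: the class names, the 
hop counts, the load figures and the 0.9 congestion threshold are 
all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    name: str
    busy: bool = False

@dataclass
class DataCenter:
    name: str
    distance_hops: int   # hypothetical hop count from the user
    load: float          # hypothetical congestion, 0.0 (idle) to 1.0 (saturated)
    servers: list = field(default_factory=list)

def pick_data_center(centers, max_load=0.9):
    """Step 1: nearest data center that is not congested, else the least loaded."""
    usable = [c for c in centers if c.load < max_load]
    if usable:
        return min(usable, key=lambda c: c.distance_hops)
    return min(centers, key=lambda c: c.load)

def pick_server(center):
    """Step 2: first idle server inside the chosen data center, or None."""
    for server in center.servers:
        if not server.busy:
            return server
    return None

def route(centers):
    center = pick_data_center(centers)
    return center, pick_server(center)

centers = [
    DataCenter("near", distance_hops=2, load=0.95,
               servers=[Server("near-1")]),
    DataCenter("far", distance_hops=6, load=0.30,
               servers=[Server("far-1", busy=True), Server("far-2")]),
]
center, server = route(centers)
print(center.name, server.name)
```

In this example the nearest data center is congested, so the 
request goes to the farther one and lands on its first idle server.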

During the Google Dance period, all the servers in all the data 
centers cannot receive the new index at the same time. In fact, 
only portions of the new index can be transferred to each data 
center at one time, and each portion is transferred to one data 
center after another. Different portions are uploaded to each 
server farm at different times, which also affects results. When 
a user queries Google during the Google Dance, they may get 
results from a data center which still has all or part of the old 
index in place one minute, and then results from a data center 
which has the new index a few minutes later. From the user's 
perspective, the change seems to have taken place almost instantly.
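
A small simulation makes this staggered rollout easier to picture. 
The Python sketch below is only a model under stated assumptions: 
the number of portions, the data center names and the strict 
portion-by-portion order are invented, and the real transfer 
schedule is not public.

```python
PORTIONS = 4  # hypothetical number of index portions

def new_data_centers(names):
    """Every data center starts with the old version of every portion."""
    return {name: ["old"] * PORTIONS for name in names}

def rollout_steps(centers):
    """Transfer one portion to one data center at a time, yielding after each."""
    for portion in range(PORTIONS):
        for name in centers:
            centers[name][portion] = "new"
            yield name, portion

def index_state(versions):
    """Summarize what a user would hit: 'old', 'new', or 'mixed'."""
    if all(v == "new" for v in versions):
        return "new"
    if all(v == "old" for v in versions):
        return "old"
    return "mixed"

centers = new_data_centers(["dc-east", "dc-west"])
for name, portion in rollout_steps(centers):
    states = {dc: index_state(v) for dc, v in centers.items()}
    print(f"portion {portion} -> {name}: {states}")
```

Partway through the run, one data center reports a mixed index 
while another is still entirely old, which is why the same query 
can return different results from one minute to the next.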

Building up a completely new index every month or so can cause 
quite a bit of trouble. After all, the search engines have to 
spider and index billions of documents and then compile the 
resulting data into one cohesive unit. That's no small 
feat. 

During the period outside of the Dance, there may also be minor 
fluctuations in search results. This is because the indexes at the 
various data centers can never be identical to each other. New 
sites are constantly being added, old ones deleted, and so on. It is 
estimated that over 8 million new web pages are created every day. 
Some of them are added to the search engines, and thus affect 
search results. 

Now, if you want Google's definition of the Google Dance, visit 
their page about the Google Dance 
(http://www.google.com/googledance2002/). Looks like fun, I'd go!


================================================================
Richard Zwicky is a founder and the CEO of Metamend Software 
(http://www.metamend.com), a Victoria, B.C.-based firm whose 
cutting-edge Search Engine Optimization software has been 
recognized around the world as a leader in its field. Employing 
a staff of 10, the firm's business comes from around the world, 
with clients from every continent. Most recently the company was
recognized for its geo-locational, or LBS, technology, which 
correlates online businesses with their physical locations. 
================================================================